Recovering from Selection Bias using Marginal Structure in Discrete Models
Abstract
This paper considers the problem of inferring a discrete joint distribution from a sample subject to selection. Abstractly, we want to identify a distribution p(x, w) from its conditional p(x | w). We introduce new assumptions on the marginal model for p(x), under which generic identification is possible. These assumptions are quite general and can easily be tested; they do not require precise background knowledge of p(x) or p(w), such as proportions estimated from previous studies. We particularly consider conditional independence constraints, which often arise from graphical and causal models, although other constraints can also be used. We show that generic identifiability of causal effects is possible in a much wider class of causal models than had previously been known.
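As a toy illustration of the idea in the abstract (not the paper's own method), the sketch below assumes two binary variables X1 and X2, a binary selection variable W, and a marginal independence constraint X1 ⊥ X2 on p(x). The conditional tables, the function names `det2` and `candidate_weights`, and the numbers are all hypothetical. Since p(x) = (1 - π) p(x | w=0) + π p(x | w=1) with π = p(w=1), and a 2x2 table is independent exactly when its determinant is zero, the constraint becomes a quadratic equation in π.

```python
import numpy as np


def det2(m):
    """Determinant of a 2x2 array, written out to keep the algebra visible."""
    return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]


def candidate_weights(cond_w0, cond_w1, tol=1e-9):
    """Return every pi = p(w=1) in [0, 1] for which the mixture
    (1 - pi) * cond_w0 + pi * cond_w1 makes X1 and X2 independent."""
    a = np.asarray(cond_w0, dtype=float)
    d = np.asarray(cond_w1, dtype=float) - a
    # det(a + pi * d) = det(d) * pi^2 + lin * pi + det(a)
    lin = (a[0, 0] * d[1, 1] + a[1, 1] * d[0, 0]
           - a[0, 1] * d[1, 0] - a[1, 0] * d[0, 1])
    roots = np.roots([det2(d), lin, det2(a)])
    return sorted(
        float(r.real)
        for r in roots
        if abs(r.imag) < tol and -tol <= r.real <= 1 + tol
    )


# Hypothetical conditionals p(x1, x2 | w); rows index x1, columns index x2.
p_x_given_w0 = np.array([[0.50, 0.15],
                         [0.25, 0.10]])
p_x_given_w1 = np.array([[0.300, 0.225],
                         [0.325, 0.150]])

candidates = candidate_weights(p_x_given_w0, p_x_given_w1)
pi = candidates[0]

# Rebuild the joint p(x, w) and the marginal p(x) from the recovered weight.
joint = np.stack([(1 - pi) * p_x_given_w0, pi * p_x_given_w1])
marginal = joint.sum(axis=0)
print(candidates, marginal)
```

For these made-up tables the quadratic has roots at about 0.4 and -2, so only π ≈ 0.4 is a valid probability and the joint distribution is pinned down. In general a constraint of this kind may leave more than one admissible root, so the sketch shows the mechanism only under the stated assumptions.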
Related articles
Estimation of Structural Parameters and Marginal Effects in Binary Choice Panel Data Models with Fixed Effects
Fixed effects estimates of structural parameters in nonlinear panel models can be severely biased due to the incidental parameters problem. In this paper I show that the most important component of this incidental parameters bias for probit fixed effects estimators of index coefficients is proportional to the true parameter value, using a large-T expansion of the bias. This result allows me to ...
Avoiding selection bias: A unified treatment of thresholded data
When searching for populations of rare and/or weak signals in noisy data, it is common to use a detection threshold to remove marginal events which are unlikely to be the signals of interest; or a detector might have limited sensitivity, causing it to not detect some of the population. In both cases a selection of data has occurred, which can potentially bias any inferences drawn from the remai...
Bayesian Analysis of Multivariate Sample Selection Models Using Gaussian Copulas
We consider the Bayes estimation of a multivariate sample selection model with p pairs of selection and outcome variables. Each of the variables may be discrete or continuous with a parametric marginal distribution, and their dependence structure is modeled through a Gaussian copula function. Markov chain Monte Carlo methods are used to simulate from the posterior distribution of interest. The ...
Bayesian Network Learning with Discrete Case-Control Data
We address the problem of learning Bayesian networks from discrete, unmatched case-control data using specialized conditional independence tests. Those tests can also be used for learning other types of graphical models or for feature selection. We also propose a post-processing method that can be applied in conjunction with any Bayesian network learning algorithm. In simulations we show that o...
Bayesian Sample size Determination for Longitudinal Studies with Continuous Response using Marginal Models
Introduction: Longitudinal study designs are common in many areas of scientific research, especially in the medical, social, and economic sciences. The reason is that longitudinal studies allow researchers to measure changes in each individual over time and often have higher statistical power than cross-sectional studies. Choosing an appropriate sample size is a crucial step in a successful study. A st...